@hmelder hmelder commented Nov 21, 2025

This pull request adds initial support for compiling Objective-C to WebAssembly. I tested my changes with libobjc2 and the swift-corelibs-blocksruntime.

There are two outstanding issues, which I cannot fix as deeper knowledge of the subsystems is required:

  1. Symbols marked as explicitly hidden in code generation are exported
  2. Clang crashes in SelectionDAG when compiling an Objective-C try/catch block with -fwasm-exceptions

First Issue

Emscripten processes the generated .wasm file in emscripten.py and checks that every exported symbol is a valid JavaScript identifier (tools/js_manipulation.py#L104). However, hidden symbols such as .objc_init are intentionally not valid C identifiers.
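To make the constraint concrete, here is a minimal sketch of an identifier check. It is an ASCII-only approximation written for illustration, not Emscripten's actual check (which lives in its Python tooling). The function name isAsciiJsIdentifier and the symbol $_OBJC_CLASS_Foo are assumptions made up for this example.

```cpp
#include <cstddef>
#include <string_view>

// ASCII-only approximation of "is this a valid JavaScript identifier?".
// Real JavaScript also accepts Unicode letters and rejects reserved words;
// both are deliberately ignored in this sketch.
constexpr bool isAsciiJsIdentifier(std::string_view name) {
  if (name.empty())
    return false;
  for (std::size_t i = 0; i < name.size(); ++i) {
    const char c = name[i];
    const bool isLetter = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
    const bool isDigit = c >= '0' && c <= '9';
    const bool isStartChar = isLetter || c == '_' || c == '$';
    // Digits are allowed everywhere except in the first position.
    const bool allowed = isStartChar || (i > 0 && isDigit);
    if (!allowed)
      return false;
  }
  return true;
}

// A leading '.' is rejected, which is why an exported .objc_init fails.
static_assert(!isAsciiJsIdentifier(".objc_init"));
// '$' is accepted, so a '$'-prefixed symbol (hypothetical name) passes.
static_assert(isAsciiJsIdentifier("$_OBJC_CLASS_Foo"));
```

Under this approximation, any exported symbol containing '.' is rejected, regardless of where the '.' appears.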

The core of the problem is that symbols with the WASM_SYMBOL_NO_STRIP attribute are exported when targeting Emscripten (https://reviews.llvm.org/D62542). This attribute is added to the symbol during relocation in WasmObjectWriter::recordRelocation. As a result, we accidentally export many hidden symbols, not only the ones generated by ObjC CG.

I'm currently working around this by not exporting no-strip symbols when targeting Emscripten, which matches the default behaviour for other Wasm targets.
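The effect of the workaround can be sketched in isolation. This is a simplified model, not the code in WasmObjectWriter: SymbolInfo, flagsBefore, and flagsAfter are hypothetical names, and the two flag values are illustrative placeholders rather than values quoted from the PR.

```cpp
#include <cstdint>

// Illustrative flag bits; the authoritative definitions live in LLVM's
// Wasm binary-format header and are not reproduced from the PR.
constexpr std::uint32_t kNoStrip = 0x80;
constexpr std::uint32_t kExported = 0x20;

// Hypothetical, simplified stand-in for the per-symbol state that the
// object writer consults when it computes symbol flags.
struct SymbolInfo {
  bool noStrip;
};

// Behaviour before the workaround: on Emscripten, no-strip implies exported.
constexpr std::uint32_t flagsBefore(SymbolInfo s, bool isEmscripten) {
  std::uint32_t flags = 0;
  if (s.noStrip) {
    flags |= kNoStrip;
    if (isEmscripten)
      flags |= kExported;
  }
  return flags;
}

// Behaviour with the workaround: no-strip never implies exported.
constexpr std::uint32_t flagsAfter(SymbolInfo s, bool /*isEmscripten*/) {
  return s.noStrip ? kNoStrip : 0;
}

// Only the Emscripten case changes; other Wasm targets are unaffected.
static_assert((flagsBefore({true}, true) & kExported) != 0);
static_assert((flagsAfter({true}, true) & kExported) == 0);
static_assert(flagsBefore({true}, false) == flagsAfter({true}, false));
```

The trade-off is that code relying on no-strip to force an export on Emscripten would need another mechanism, which is the question raised about D62542.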

Second Issue

Here is a minimal example that triggers the crash.

#include <stdio.h>

int main(void) {
	int ret = 0;
	@try {
	}
	@catch (id a) {
		ret = 1;
		puts("abc");
	}

	return ret;
}

The following assertion is triggered:

clang: /home/vm/llvm-project/llvm/lib/Target/WebAssembly/WebAssemblyExceptionInfo.cpp:124: void llvm::WebAssemblyExceptionInfo::recalculate(MachineFunction &, MachineDominatorTree &, const MachineDominanceFrontier &): Assertion `EHInfo' failed.

Here is the crash report main-c3884.zip.

You can use emcc with a modified LLVM build by exporting EM_LLVM_ROOT before sourcing emsdk/emsdk_env.sh:

emcc -fobjc-runtime=gnustep-2.2 -fwasm-exceptions -c main.m

or just invoke clang directly:

/home/vm/llvm-build-wasm/bin/clang -target wasm32-unknown-emscripten -mllvm -combiner-global-alias-analysis=false -mllvm -wasm-enable-sjlj -mllvm -wasm-use-legacy-eh=false -mllvm -disable-lsr --sysroot=/home/vm/emsdk/upstream/emscripten/cache/sysroot -DEMSCRIPTEN -fobjc-runtime=gnustep-2.2 -fwasm-exceptions -c main.m

Building libobjc2 and the BlocksRuntime

Building the BlocksRuntime

cmake -DCMAKE_TOOLCHAIN_FILE=$EMSDK/upstream/emscripten/cmake/Modules/Platform/Emscripten.cmake   -DCMAKE_INSTALL_PREFIX=/home/vm/demo-install -DCMAKE_BUILD_TYPE=Debug -B build -G Ninja

Building libobjc2

cmake -DCMAKE_TOOLCHAIN_FILE=$EMSDK/upstream/emscripten/cmake/Modules/Platform/Emscripten.cmake   -DCMAKE_INSTALL_PREFIX=/home/vm/demo-install -DBlocksRuntime_LIBRARIES=/home/vm/demo-install/lib/libBlocksRuntime.a -DBlocksRuntime_INCLUDE_DIR=/home/vm/demo-install/include/BlocksRuntime -DEMBEDDED_BLOCKS_RUNTIME=OFF -DTESTS=OFF  -B build  -DCMAKE_BUILD_TYPE=Debug  -G Ninja

@llvmbot llvmbot added backend:WebAssembly clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' clang:codegen IR generation bugs: mangling, exceptions, etc. llvm:mc Machine (object) code labels Nov 21, 2025
llvmbot commented Nov 21, 2025

@llvm/pr-subscribers-clang
@llvm/pr-subscribers-clang-codegen
@llvm/pr-subscribers-clang-driver

@llvm/pr-subscribers-backend-webassembly

Author: Hugo Melder (hmelder)

Changes

Full diff: https://github.com/llvm/llvm-project/pull/169043.diff

3 Files Affected:

  • (modified) clang/lib/CodeGen/CGObjCGNU.cpp (+10-4)
  • (modified) clang/lib/Driver/ToolChains/Clang.cpp (+2-1)
  • (modified) llvm/lib/MC/WasmObjectWriter.cpp (-3)
diff --git a/clang/lib/CodeGen/CGObjCGNU.cpp b/clang/lib/CodeGen/CGObjCGNU.cpp
index 06643d4bdc211..3b9f9f306829d 100644
--- a/clang/lib/CodeGen/CGObjCGNU.cpp
+++ b/clang/lib/CodeGen/CGObjCGNU.cpp
@@ -179,8 +179,15 @@ class CGObjCGNU : public CGObjCRuntime {
       (R.getVersion() >= VersionTuple(major, minor));
   }
 
-  std::string ManglePublicSymbol(StringRef Name) {
-    return (StringRef(CGM.getTriple().isOSBinFormatCOFF() ? "$_" : "._") + Name).str();
+  const std::string ManglePublicSymbol(StringRef Name) {
+    auto triple = CGM.getTriple();
+
+    // Exported symbols in Emscripten must be a valid Javascript identifier.
+    if (triple.isOSBinFormatCOFF() || triple.isOSBinFormatWasm()) {
+      return (StringRef("$_") + Name).str();
+    } else {
+      return (StringRef("._") + Name).str();
+    }
   }
 
   std::string SymbolForProtocol(Twine Name) {
@@ -4106,8 +4113,7 @@ llvm::Function *CGObjCGNU::ModuleInitFunction() {
   if (!ClassAliases.empty()) {
     llvm::Type *ArgTypes[2] = {PtrTy, PtrToInt8Ty};
     llvm::FunctionType *RegisterAliasTy =
-      llvm::FunctionType::get(Builder.getVoidTy(),
-                              ArgTypes, false);
+        llvm::FunctionType::get(BoolTy, ArgTypes, false);
     llvm::Function *RegisterAlias = llvm::Function::Create(
       RegisterAliasTy,
       llvm::GlobalValue::ExternalWeakLinkage, "class_registerAlias_np",
diff --git a/clang/lib/Driver/ToolChains/Clang.cpp b/clang/lib/Driver/ToolChains/Clang.cpp
index 30d3e5293a31b..6cbec5e17ae1a 100644
--- a/clang/lib/Driver/ToolChains/Clang.cpp
+++ b/clang/lib/Driver/ToolChains/Clang.cpp
@@ -8001,7 +8001,8 @@ ObjCRuntime Clang::AddObjCRuntimeArgs(const ArgList &args,
     if ((runtime.getKind() == ObjCRuntime::GNUstep) &&
         (runtime.getVersion() >= VersionTuple(2, 0)))
       if (!getToolChain().getTriple().isOSBinFormatELF() &&
-          !getToolChain().getTriple().isOSBinFormatCOFF()) {
+          !getToolChain().getTriple().isOSBinFormatCOFF() &&
+          !getToolChain().getTriple().isOSBinFormatWasm()) {
         getToolChain().getDriver().Diag(
             diag::err_drv_gnustep_objc_runtime_incompatible_binary)
           << runtime.getVersion().getMajor();
diff --git a/llvm/lib/MC/WasmObjectWriter.cpp b/llvm/lib/MC/WasmObjectWriter.cpp
index 15590b31fd07f..d882146e21b8a 100644
--- a/llvm/lib/MC/WasmObjectWriter.cpp
+++ b/llvm/lib/MC/WasmObjectWriter.cpp
@@ -1794,9 +1794,6 @@ uint64_t WasmObjectWriter::writeOneObject(MCAssembler &Asm,
       Flags |= wasm::WASM_SYMBOL_UNDEFINED;
     if (WS.isNoStrip()) {
       Flags |= wasm::WASM_SYMBOL_NO_STRIP;
-      if (isEmscripten()) {
-        Flags |= wasm::WASM_SYMBOL_EXPORTED;
-      }
     }
     if (WS.hasImportName())
       Flags |= wasm::WASM_SYMBOL_EXPLICIT_NAME;

llvmbot commented Nov 21, 2025

@llvm/pr-subscribers-llvm-mc

Author: Hugo Melder (hmelder)

github-actions bot commented Nov 21, 2025

🐧 Linux x64 Test Results

  • 111606 tests passed
  • 4467 tests skipped

hmelder commented Nov 28, 2025

@sunfishcode, I see that you are the original author of https://reviews.llvm.org/D62542. As @dschuff said in the review back then:

I'm hoping we can make that export behavior nicer soon; I find the attribute(used) -> export behavior a bit odd too. Once we drop fastcomp it will be easier to redefine EMSCRIPTEN_KEEPALIVE and other things.

That was in 2019. Is this hack still required now that fastcomp is deprecated? The problem with the current behaviour is that hidden no-strip symbols added during codegen are exported.

hmelder commented Nov 28, 2025

@davidchisnall the changes in codegen are trivial:

  1. Mangle public symbols with '$' instead of '.', as the latter is not valid in a JavaScript identifier.
  2. Fix the function signature of class_registerAlias_np to return a bool instead of void.
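The mangling change in the first item can be sketched outside of Clang as follows. BinFormat and manglePublicSymbol are hypothetical stand-ins for the triple query and the CGObjCGNU helper, and OBJC_CLASS_Foo in the usage note is an illustrative name.

```cpp
#include <string>
#include <string_view>

// Hypothetical stand-in for the object-file formats that codegen distinguishes.
enum class BinFormat { ELF, COFF, Wasm };

// Mirrors the prefix choice in the PR: "$_" for COFF and Wasm, "._" otherwise.
inline std::string manglePublicSymbol(BinFormat format, std::string_view name) {
  const bool useDollar = format == BinFormat::COFF || format == BinFormat::Wasm;
  std::string mangled = useDollar ? "$_" : "._";
  mangled.append(name);
  return mangled;
}
```

With this sketch, manglePublicSymbol(BinFormat::Wasm, "OBJC_CLASS_Foo") yields "$_OBJC_CLASS_Foo", while the ELF variant keeps the "._" prefix.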

hmelder commented Nov 28, 2025

Assuming that the new Wasm exception implementation correctly implements the mandatory functions and data structures of the Itanium EH ABI, not much needs to be done to get EH working with libobjc2. I just need to find the root cause of the crash...

@davidchisnall davidchisnall left a comment

The Objective-C bits look fine to me; the MC bit should possibly be a separate PR.

github-actions bot commented Nov 28, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

@llvmbot llvmbot added the clang Clang issues not falling into any other category label Nov 28, 2025
@davidchisnall davidchisnall left a comment

Code LGTM once clang-format is happy.

We should probably have a test in clang/test/CodeGenObjC checking that the mangling is correct for Wasm.

dschuff commented Dec 5, 2025

+cc @sbc100 about issue 1 and @aheejin about issue 2.

Pardon my ignorance about objc here... When you say "explicitly hidden", what do you mean exactly? Do you mean something like __attribute__((visibility("hidden")))?

On exception handling, I took a quick look at the IR output of your example. The cc1 command includes

-target-feature +exception-handling -target-feature +multivalue -target-feature +reference-types -exception-model=wasm -mllvm -wasm-enable-eh -fobjc-exceptions -fexceptions -mllvm -wasm-enable-sjlj -mllvm -wasm-use-legacy-eh

which looks approximately right. The IR output uses the @__gnustep_objc_personality_v0 personality and objc exception runtime functions, including @objc_begin_catch(ptr %exn). The exception handling ABI for objc will probably have to be tuned the way the libc++ EH ABI was: some small tweaks in the frontend to get the same behavior, and some larger tweaks in the runtime, as there were with libc++ and libc++abi.
I believe all of our wasm-specific changes to the EH runtime have been upstreamed by @aheejin, so you can take a look at the wasm-specific code there to get an idea of what it would take for objc.


// Exported symbols in Emscripten must be a valid Javascript identifier.
auto triple = CGM.getTriple();
if (triple.isOSBinFormatCOFF() || triple.isOSBinFormatWasm()) {
@dschuff dschuff Dec 5, 2025

The restriction on valid JS identifiers is specific to Emscripten rather than wasm as a whole, so you might want to check for isOSEmscripten here rather than the bin format. But if you want to have a common ABI across Emscripten and WASI, then this would be OK with me too.
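The two options in this comment differ only for Wasm targets that are not Emscripten. A rough sketch of that difference follows; Target, usesDollarByFormat, and usesDollarByOS are hypothetical names for illustration and do not model llvm::Triple.

```cpp
// Hypothetical, simplified target description (not llvm::Triple).
enum class BinFormat { ELF, COFF, Wasm };
enum class OS { Linux, Windows, Emscripten, WASI };

struct Target {
  BinFormat format;
  OS os;
};

// Policy in the PR as posted: every Wasm target gets the '$' prefix.
constexpr bool usesDollarByFormat(Target t) {
  return t.format == BinFormat::COFF || t.format == BinFormat::Wasm;
}

// Alternative raised in review: only Emscripten (and COFF) gets it.
constexpr bool usesDollarByOS(Target t) {
  return t.format == BinFormat::COFF || t.os == OS::Emscripten;
}

// Both policies agree for Emscripten ...
static_assert(usesDollarByFormat({BinFormat::Wasm, OS::Emscripten}));
static_assert(usesDollarByOS({BinFormat::Wasm, OS::Emscripten}));
// ... and diverge for WASI, where only the format-based policy uses '$'.
static_assert(usesDollarByFormat({BinFormat::Wasm, OS::WASI}));
static_assert(!usesDollarByOS({BinFormat::Wasm, OS::WASI}));
```

Keying on the binary format gives one symbol naming scheme across Emscripten and WASI, while keying on the OS limits the change to the target that actually imposes the JavaScript identifier restriction.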
